Performance Evaluation of Consistent Recovery Protocols Using MPICH-GF
Authors
Abstract
This paper presents an implementation of several consistent recovery protocols at the abstract device level and compares their performance. We performed experiments using three NAS Parallel Benchmark applications with class C datasets on state-of-the-art equipment. The interesting result is that the causal message logging protocol has the most expensive recovery cost with communication-intensive applications, since it suffers from a concentrated overload of simultaneous message replaying. Receiver-based optimistic message logging has the lowest recovery cost, with the drawback of extensive disk access overhead in failure-free executions. Coordinated checkpointing seems the most practical choice among them.
Similar resources
MPICH-GF: Transparent Checkpointing and Rollback-Recovery for Grid-Enabled MPI Processes
Fault tolerance is an essential element of distributed systems that require a reliable computation environment. Despite extensive research over two decades, practical fault-tolerance systems have not been provided, owing to the high overhead and unwieldiness of previous fault-tolerance systems. In this paper, we propose MPICH-GF, a user-transparent checkpointing system f...
MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI
High-performance computing platforms such as clusters, grids, and desktop grids are becoming larger and subject to more frequent failures. MPI is one of the most widely used message-passing libraries in HPC applications. These two trends raise the need for fault-tolerant MPI. The MPICH-V project focuses on designing, implementing and comparing several automatic fault tolerance protocols for MPI applications....
Comparative Performance Analysis of AODV, DSR, TORA and OLSR Routing Protocols in MANET Using OPNET
Mobile Ad Hoc Networks (MANETs) are receiving a significant interest and are becoming very popular in the world of wireless networks and telecommunication. MANETs consist of mobile nodes which can communicate with each other without any infrastructure or centralized administration. In MANETs, the movement of nodes is unpredictable and complex; thus making the routing of the packets challenging....
Porting MPICH ADI on GAMMA with Flow Control
The Genoa Active Message MAchine (GAMMA) is an experimental prototype of a lightweight communication system based on the Active Ports paradigm and designed for efficient implementation over low-cost Fast Ethernet interconnects. The original prototype implementation started in 1996 and obtained its best performance by removing traditional communication protocols and implementing zero-copy send and r...
Evaluation of MPI Implementations on Grid-connected Clusters using an Emulated WAN Environment
The MPICH-SCore high-performance communication library for cluster computing is integrated into the MPICH-G2 library in order to adapt PC clusters to a Grid environment. The integrated library is called MPICH-G2/SCore. In addition, for the purpose of comparison with other approaches, MPICH-SCore itself is extended to encapsulate its network packet into a UDP packet so that packets are delivered ...